LLM scaling laws Flash News List | Blockchain.News

List of Flash News about LLM scaling laws

2026-01-07 23:01
Karpathy Reveals nanochat Scaling-Law Breakthrough: Compute-Optimal LLMs on 8x H100 for about $100, CORE-Score Benchmarks vs GPT-2/3

According to @karpathy, nanochat’s first public miniseries v1 demonstrates compute-optimal LLM training across model sizes at fixed FLOPs, with an end-to-end pipeline and reproducible scripts, source: @karpathy on X, Jan 7, 2026; nanochat GitHub discussion #420. He reports that nanochat reproduces Chinchilla-like scaling, with equal exponents of roughly 0.5 on parameters and data and a single compute-independent constant of about 8 tokens per parameter, versus the roughly 20 reported for Chinchilla, source: @karpathy on X, Jan 7, 2026; Hoffmann et al. 2022 (Chinchilla). The sweep from d10 to d20 achieves non-intersecting training curves at batch sizes around 2^19 (about 0.5M) on one 8x H100 node without gradient accumulation, source: @karpathy on X, Jan 7, 2026. He aligns nanochat with GPT-2 and an estimated GPT-3 using the CORE score for an objective cross-series comparison, source: @karpathy on X, Jan 7, 2026; DCLM paper (CORE score). The total experiment cost is about $100 for roughly 4 hours on one 8x H100 node, with all tuning and code pushed to master for reproduction via scaling_laws.sh and miniseries.sh, source: @karpathy on X, Jan 7, 2026; nanochat GitHub discussion #420. That works out to roughly $3.1 per H100 GPU-hour for the described run ($100 / (8 GPUs × 4 hours) ≈ $3.1), offering a live reference point for pricing compute in AI workloads, source: calculation based on @karpathy on X, Jan 7, 2026. For crypto markets, decentralized GPU networks that price or facilitate GPU time make these cost and scaling benchmarks directly relevant to workload pricing and benchmarking on networks such as Render Network (RNDR) and Akash Network (AKT), source: Render Network documentation; Akash Network documentation.
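
To make the reported relationship concrete, here is a minimal sketch of compute-optimal sizing. It is not nanochat’s code (the actual reproduction is scaling_laws.sh and miniseries.sh), and it assumes the common C ≈ 6·N·D estimate of training FLOPs, which the post itself does not state. With equal exponents of about 0.5, fixing the tokens-per-parameter ratio r = D/N determines both the parameter count and the token count from the compute budget alone.

# Hedged sketch, not nanochat's implementation. Assumes C ~= 6*N*D
# (training FLOPs ~ 6 x parameters x tokens) and a fixed, compute-independent
# tokens-per-parameter ratio r = D/N, as reported in the post.
import math

def compute_optimal(c_flops, tokens_per_param=8.0):
    # With C = 6*N*D and D = r*N: N = sqrt(C / (6*r)) and D = r*N,
    # so both N and D scale as C**0.5 (equal exponents on parameters and data).
    r = tokens_per_param
    n_params = math.sqrt(c_flops / (6.0 * r))
    n_tokens = r * n_params
    return n_params, n_tokens

# Example: a 1e19-FLOP budget under r = 8 (nanochat) vs r = 20 (Chinchilla).
for r in (8.0, 20.0):
    n, d = compute_optimal(1e19, r)
    print(f"r={r:>4}: ~{n:.2e} params, ~{d:.2e} tokens")

Running it with r = 8 versus r = 20 illustrates the practical effect of Karpathy’s constant: at the same compute budget, the lower ratio favors a somewhat larger model trained on fewer tokens than the Chinchilla prescription.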

2025-10-01 17:09
Andrej Karpathy on Sutton’s Bitter Lesson: LLM Scaling Limits, RL-First Agents, and the AI Trading Narrative to Watch

According to @karpathy, Richard Sutton questions whether LLMs are truly bitter-lesson-pilled because they depend on finite, human-generated datasets that embed bias, challenging the idea that performance can scale indefinitely with more compute and data, source: @karpathy. Sutton advocates a classic RL-first architecture that learns through world interaction without giant supervised pretraining or human teleoperation, emphasizing intrinsic motivation such as fun, curiosity, and prediction-quality rewards, source: @karpathy. He highlights that agents should continue learning at test time by default rather than being trained once and deployed statically, source: @karpathy. Karpathy notes that while AlphaZero shows pure RL can surpass human-initialized systems (AlphaGo), Go is a closed, simplified domain, whereas frontier LLMs rely on human text to initialize billions of parameters before pervasive RL fine-tuning, framing pretraining as "crappy evolution" to solve cold start, source: @karpathy. He adds that today’s LLMs are heavily engineered by humans across pretraining, curation, and RL environments, and the field may not be sufficiently bitter-lesson-pilled, source: @karpathy. Actionably, he cites directions like intrinsic motivation, curiosity, empowerment, multi-agent self-play, and culture as areas for further work beyond benchmaxxing, positioning the AI-agent path as an active research narrative, source: @karpathy.
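
As a hedged illustration of the "prediction-quality reward" idea Karpathy summarizes, the sketch below pays an agent in proportion to its own surprise, using a tiny forward model that keeps updating so the bonus fades as the world becomes predictable. The class name CuriosityReward and the linear model are assumptions made for this example; nothing here comes from the thread itself.

# Illustrative only: a curiosity-style intrinsic reward equal to the
# prediction error of a small forward model of the next observation.
import numpy as np

class CuriosityReward:
    def __init__(self, obs_dim, lr=1e-2):
        self.W = np.zeros((obs_dim, obs_dim))  # linear model: next_obs ~ W @ obs
        self.lr = lr

    def step(self, obs, next_obs):
        pred = self.W @ obs
        err = next_obs - pred
        reward = float(err @ err)  # surprising transitions pay more
        # Online update: the bonus decays as the model improves, so the agent
        # keeps learning (and keeps seeking novelty) after deployment,
        # matching the test-time learning point above.
        self.W += self.lr * np.outer(err, obs)
        return reward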
